{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "765f8416-0580-440c-8c19-a5c6012dfbe9",
   "metadata": {},
   "source": [
    "## RAPIDS Memory Manager (RMM)\n",
    "We recommend using RMM to configure GPU memory resources. As with other libraries in the RAPIDS ecosystem, cuVS performs best with a pool memory resource, which allocates a chunk of memory up front on the current GPU device and serves later allocations from that pool. The example below reserves an initial pool of 2**30 bytes (1 GiB)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "a7d49206-c497-46cb-b6c9-2fc721bf1692",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "<frozen importlib._bootstrap_external>:1241: FutureWarning: The cuda.cudart module is deprecated and will be removed in a future release, please switch to use the cuda.bindings.runtime module instead.\n",
      "<frozen importlib._bootstrap_external>:1241: FutureWarning: The cuda.cuda module is deprecated and will be removed in a future release, please switch to use the cuda.bindings.driver module instead.\n"
     ]
    }
   ],
   "source": [
    "import rmm\n",
    "pool = rmm.mr.PoolMemoryResource(\n",
    "    rmm.mr.CudaMemoryResource(),\n",
    "    initial_pool_size=2**30\n",
    ")\n",
    "rmm.mr.set_current_device_resource(pool)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "0e53eae7-3962-4714-8c87-d919a27e9d19",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<rmm._lib.memory_resource.PoolMemoryResource object at 0x7f2fac3d2780>\n"
     ]
    }
   ],
   "source": [
    "# check the current device resource\n",
    "current_resource = rmm.mr.get_current_device_resource()\n",
    "print(current_resource)"
   ]
  },
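  {
   "cell_type": "markdown",
   "id": "sketch-rmm-pool-sizing",
   "metadata": {},
   "source": [
    "To put the pool size in context, the sketch below converts the `initial_pool_size` used above into GiB and compares it with the raw size of the 1M x 96 float32 dataset created later in this notebook. It is plain Python arithmetic for illustration only; `gib` is a hypothetical helper and not part of RMM.\n",
    "\n",
    "```python\n",
    "def gib(n_bytes):\n",
    "    # Convert a byte count to GiB (1 GiB = 2**30 bytes).\n",
    "    return n_bytes / 2**30\n",
    "\n",
    "initial_pool_size = 2**30               # value passed to PoolMemoryResource above\n",
    "raw_dataset_bytes = 1_000_000 * 96 * 4  # 1M float32 vectors of dimension 96\n",
    "\n",
    "print(f'initial pool: {gib(initial_pool_size):.2f} GiB')\n",
    "print(f'raw float32 dataset: {gib(raw_dataset_bytes):.2f} GiB')\n",
    "```\n",
    "\n",
    "The raw dataset comes to roughly 0.36 GiB, so it fits within the 1 GiB initial pool. This estimate ignores index structures and temporary buffers, which add to the actual footprint."
   ]
  },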
  {
   "cell_type": "markdown",
   "id": "a349fc6e-ee6a-45ed-a693-045cd8289746",
   "metadata": {},
   "source": [
    "## Building a cuVS GPU index\n",
    "With the faiss-gpu-cuvs package, the cuVS implementation is chosen by default for supported index types, so existing code can use it without changes. The example below creates an IVFPQ index using cuVS."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "3ab451ca-56b5-4df7-96b4-4ddcb560b6ab",
   "metadata": {},
   "outputs": [],
   "source": [
    "import faiss\n",
    "import numpy as np\n",
    "\n",
    "np.random.seed(1234)\n",
    "xb = np.random.random((1000000, 96)).astype('float32')\n",
    "xq = np.random.random((10000, 96)).astype('float32')\n",
    "xt = np.random.random((100000, 96)).astype('float32')\n",
    "\n",
    "res = faiss.StandardGpuResources()\n",
    "# Disable the default temporary memory allocation since an RMM pool resource has already been set.\n",
    "res.noTempMemory()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "4be32183-d43c-4f2c-b1c5-23b9eef86f43",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "using ivf_pq::index_params nrows 100000, dim 96, n_lits 1024, pq_dim 96\n",
      "CPU times: user 6.11 s, sys: 614 ms, total: 6.73 s\n",
      "Wall time: 307 ms\n",
      "CPU times: user 114 ms, sys: 12.2 ms, total: 126 ms\n",
      "Wall time: 126 ms\n"
     ]
    }
   ],
   "source": [
    "# Case 1: Creating cuVS GPU index\n",
    "config = faiss.GpuIndexIVFPQConfig()\n",
    "config.interleavedLayout = True\n",
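    "assert config.use_cuvs  # cuVS is selected by default with faiss-gpu-cuvs\n",
    "# Arguments: dimension 96, 1024 IVF lists, 96 sub-quantizers, 6 bits per code.\n",
    "# cuVS supports an expanded parameter set, which allows 6 bits per code here.\n",
    "index_gpu = faiss.GpuIndexIVFPQ(res, 96, 1024, 96, 6, faiss.METRIC_L2, config)\n",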
    "%time index_gpu.train(xt)\n",
    "%time index_gpu.add(xb)"
   ]
  },
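  {
   "cell_type": "markdown",
   "id": "sketch-pq-code-size",
   "metadata": {},
   "source": [
    "The effect of the bits-per-code setting on storage can be estimated with simple arithmetic. The sketch below compares 6 and 8 bits per code for 96 sub-quantizers against the raw 96-dimensional float32 vector. `pq_code_bytes` is a hypothetical helper, and the estimate assumes densely packed codes; it ignores vector ids, list overhead, and any padding introduced by the interleaved layout.\n",
    "\n",
    "```python\n",
    "def pq_code_bytes(sub_quantizers, bits_per_code):\n",
    "    # Bytes needed for one PQ-encoded vector, rounded up to whole bytes.\n",
    "    return (sub_quantizers * bits_per_code + 7) // 8\n",
    "\n",
    "raw_bytes = 96 * 4  # one float32 vector of dimension 96\n",
    "\n",
    "for bits in (6, 8):\n",
    "    code = pq_code_bytes(96, bits)\n",
    "    print(f'{bits} bits/code: {code} bytes per vector, {raw_bytes / code:.2f}x smaller than raw')\n",
    "```\n",
    "\n",
    "Under these assumptions, 6 bits per code stores each vector in 72 bytes instead of 96, at the cost of a coarser quantization."
   ]
  },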
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "3886cb6d-a24b-4f4d-b0a5-03b19188a059",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "CPU times: user 2.61 s, sys: 189 ms, total: 2.8 s\n",
      "Wall time: 35.1 ms\n",
      "CPU times: user 4.67 s, sys: 329 ms, total: 5 s\n",
      "Wall time: 247 ms\n",
      "CPU times: user 107 ms, sys: 60.1 ms, total: 167 ms\n",
      "Wall time: 167 ms\n"
     ]
    }
   ],
   "source": [
    "# Case 2: Cloning a CPU index to a cuVS GPU index\n",
    "quantizer = faiss.IndexFlatL2(96)\n",
    "index_cpu = faiss.IndexIVFPQ(quantizer, 96, 1024, 96, 8, faiss.METRIC_L2)\n",
    "index_cpu.train(xt)\n",
    "co = faiss.GpuClonerOptions()\n",
    "%time index_gpu = faiss.index_cpu_to_gpu(res, 0, index_cpu, co)\n",
    "\n",
    "# The cuVS index now uses the trained quantizer as its IVF centroids.\n",
    "assert index_gpu.is_trained\n",
    "%time index_gpu.add(xb)\n",
    "k = 10\n",
    "%time D, I = index_gpu.search(xq, k)"
   ]
  },
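  {
   "cell_type": "markdown",
   "id": "sketch-recall-at-k",
   "metadata": {},
   "source": [
    "Search timings are only meaningful alongside accuracy. One common check is recall@k: the fraction of exact nearest neighbors that the approximate search also returns. The sketch below is a minimal pure-Python version run on toy id lists. `recall_at_k` is a hypothetical helper; in practice `found` could come from `I.tolist()` and `truth` from an exact search such as `faiss.IndexFlatL2` over the same data.\n",
    "\n",
    "```python\n",
    "def recall_at_k(found, truth):\n",
    "    # Count true neighbors that appear in the returned list, summed over all queries.\n",
    "    hits = sum(len(set(f) & set(t)) for f, t in zip(found, truth))\n",
    "    return hits / sum(len(t) for t in truth)\n",
    "\n",
    "# Toy example: two queries, k = 3.\n",
    "found = [[3, 1, 7], [4, 2, 9]]  # ids returned by the approximate index\n",
    "truth = [[1, 3, 5], [2, 4, 9]]  # ids returned by an exact search\n",
    "print(f'recall@3 = {recall_at_k(found, truth):.3f}')\n",
    "```\n",
    "\n",
    "Here 5 of the 6 true neighbors are recovered, so recall@3 is about 0.833."
   ]
  },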
  {
   "cell_type": "markdown",
   "id": "ea37b4e6-c67f-4ef5-a90a-79f6bf6115d5",
   "metadata": {},
   "source": [
    "## Building a cuVS CAGRA index\n",
    "The following example demonstrates building and searching the CAGRA index with FAISS."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "05a34ea2-8597-4758-8b0d-c1830c32f018",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "using ivf_pq::index_params nrows 1000000, dim 96, n_lits 1000, pq_dim 24\n",
      "CPU times: user 5min 20s, sys: 15.7 s, total: 5min 36s\n",
      "Wall time: 4.94 s\n",
      "CPU times: user 817 ms, sys: 52.7 ms, total: 869 ms\n",
      "Wall time: 12.5 ms\n"
     ]
    }
   ],
   "source": [
    "# Step 1: Create the CAGRA index config\n",
    "config = faiss.GpuIndexCagraConfig()\n",
    "config.graph_degree = 32\n",
    "config.intermediate_graph_degree = 64\n",
    "\n",
    "# Step 2: Initialize the CAGRA index\n",
    "res = faiss.StandardGpuResources()\n",
    "gpu_cagra_index = faiss.GpuIndexCagra(res, 96, faiss.METRIC_L2, config)\n",
    "\n",
    "# Step 3: Build the index from 1M vectors. For CAGRA, train() constructs the graph from the data.\n",
    "n = 1000000\n",
    "data = np.random.random((n, 96)).astype('float32')\n",
    "%time gpu_cagra_index.train(data)\n",
    "\n",
    "# Step 4: Search the index for top 10 neighbors for each query.\n",
    "xq = np.random.random((10000, 96)).astype('float32')\n",
    "%time D, I = gpu_cagra_index.search(xq, 10)"
   ]
  },
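  {
   "cell_type": "markdown",
   "id": "sketch-cagra-graph-size",
   "metadata": {},
   "source": [
    "The two degree settings above trade graph size against build effort: an intermediate graph with `intermediate_graph_degree` neighbors per vector is built first and then pruned down to `graph_degree`. The sketch below estimates the size of each graph for the 1M vectors used here. It assumes a fixed-degree neighbor list per vector stored as 32-bit ids, which is an assumption for illustration and not a statement about cuVS internals; `cagra_graph_bytes` is a hypothetical helper.\n",
    "\n",
    "```python\n",
    "def cagra_graph_bytes(n_vectors, degree, id_bytes=4):\n",
    "    # Assumption: one fixed-size neighbor list per vector, stored as 32-bit ids.\n",
    "    return n_vectors * degree * id_bytes\n",
    "\n",
    "final_graph = cagra_graph_bytes(1_000_000, 32)         # graph_degree = 32\n",
    "intermediate_graph = cagra_graph_bytes(1_000_000, 64)  # intermediate_graph_degree = 64\n",
    "\n",
    "print(f'final graph: {final_graph / 1e6:.0f} MB')\n",
    "print(f'intermediate graph: {intermediate_graph / 1e6:.0f} MB')\n",
    "```\n",
    "\n",
    "Under these assumptions the final graph takes about 128 MB in addition to the 384 MB of raw float32 vectors."
   ]
  },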
  {
   "cell_type": "markdown",
   "id": "bbcc7bfe-9ad6-4ffe-8d7b-7e6a2e309dd2",
   "metadata": {},
   "source": [
    "## CAGRA to HNSW\n",
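    "A CAGRA index built on the GPU can be converted to a CPU HNSW index through the new `faiss.IndexHNSWCagra` class. The CAGRA graph initializes the HNSW base layer, and further vectors can then be added on the CPU."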
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "2c87aa26-7aae-42f4-8d89-7324e7a3615e",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "CPU times: user 1min, sys: 3.54 s, total: 1min 4s\n",
      "Wall time: 1.58 s\n",
      "CPU times: user 2min 59s, sys: 2.88 s, total: 3min 2s\n",
      "Wall time: 2.93 s\n"
     ]
    }
   ],
   "source": [
    "# Create the HNSW index object.\n",
    "d = 96\n",
    "M = 16\n",
    "cpu_hnsw_index = faiss.IndexHNSWCagra(d, M, faiss.METRIC_L2)\n",
    "# Build the full HNSW hierarchy rather than only the base level.\n",
    "cpu_hnsw_index.base_level_only = False\n",
    "\n",
    "# Initialize the HNSW base layer with the CAGRA graph.\n",
    "%time gpu_cagra_index.copyTo(cpu_hnsw_index)\n",
    "\n",
    "# Add new vectors to the hierarchy.\n",
    "new_vecs = np.random.random((100000, 96)).astype('float32')\n",
    "%time cpu_hnsw_index.add(new_vecs)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.13.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
